Studies of model selection and regularization for generalization in neural networks with applications
Author
Abstract
This thesis investigates the generalization problem in artificial neural networks, approaching it from two major directions: regularization and model selection. On the regularization side, under the Kullback–Leibler divergence framework for feedforward neural networks, we develop a new formula for the regularization parameter in Gaussian kernel density estimation based on the available training data sets. Experiments show that the estimated regularization parameter is valid in most cases. With the derived formula, all sample data sets can be used to estimate the smoothing parameter, which is computationally cheaper than leave-one-out cross-validation. Furthermore, the new covariance matrix estimation formula is suitable for small-sample, high-dimensional data in the regularized Gaussian classifier case. On the model selection side, both theoretical analysis and extensive experiments investigate the Bayesian Ying-Yang (BYY) model selection criterion for determining the number of clusters in small-sample-size cases. We derive a new formula for estimating the smoothing parameters, with suitable approximations, in the Smoothed Expectation-Maximization (SEM) technique. Experimental results show that with improved mixture model parameters, the performance of the BYY model selection criterion is enhanced. From the model selection viewpoint, generalization can also be improved by combining several nets to form ensemble networks. The relationship between Mixture of Experts (ME) and ensemble networks is established in this thesis. In an approximation where ME reduces to ensemble neural nets, the ensemble nets can be optimized globally rather than as individual members. A new method is consequently proposed to average
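The abstract contrasts a closed-form smoothing-parameter formula with leave-one-out cross-validation. As a rough illustration of the baseline being compared against (not the thesis's own formula), the following Python sketch scores candidate bandwidths for a one-dimensional Gaussian kernel density estimate by leave-one-out log-likelihood; all function names and candidate values are illustrative assumptions.

```python
import numpy as np

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h.

    Each point x_i is scored under the density estimated from the
    remaining n-1 points; summing the log-densities scores h.
    """
    n = len(data)
    total = 0.0
    for i in range(n):
        rest = np.delete(data, i)
        # Gaussian kernel density at data[i], excluding data[i] itself
        k = np.exp(-0.5 * ((data[i] - rest) / h) ** 2) / (h * np.sqrt(2 * np.pi))
        total += np.log(k.mean())
    return total

def select_bandwidth(data, candidates):
    """Pick the candidate bandwidth maximizing the LOO log-likelihood."""
    scores = [loo_log_likelihood(data, h) for h in candidates]
    return candidates[int(np.argmax(scores))]

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=200)
h_best = select_bandwidth(sample, [0.05, 0.1, 0.2, 0.4, 0.8])
print(h_best)
```

Note the cost: each candidate requires n density evaluations of n-1 points, which is exactly the O(n^2)-per-candidate expense that a closed-form estimate of the smoothing parameter avoids.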
Similar articles
A Review of Epidemic Forecasting Using Artificial Neural Networks
Background and aims: Since accurate forecasts help inform decisions for preventive health-care intervention and epidemic control, this goal can only be achieved by making use of appropriate techniques and methodologies. As much as forecast precision is important, methods and model selection procedures are critical to forecast precision. This study aimed at providing an overview o...
On Comparison of Adaptive Regularization Methods
Modeling with flexible models, such as neural networks, requires careful control of the model complexity and generalization ability of the resulting model, which finds expression in the ubiquitous bias-variance dilemma [4]. Regularization is a tool for optimizing the model structure, reducing variance at the expense of introducing extra bias. The overall objective of adaptive regularization is to tun...
Network information criterion - determining the number of hidden units for an artificial neural network model
The problem of model selection, or determination of the number of hidden units, can be approached statistically, by generalizing Akaike's information criterion (AIC) to be applicable to unfaithful (i.e., unrealizable) models with general loss criteria including regularization terms. The relation between the training error and the generalization error is studied in terms of the number of the tra...
Novel Radial Basis Function Neural Networks based on Probabilistic Evolutionary and Gaussian Mixture Model for Satellites Optimum Selection
In this study, two novel learning algorithms have been applied to the Radial Basis Function Neural Network (RBFNN) to approximate functions of high non-linear order. The Probabilistic Evolutionary (PE) and Gaussian Mixture Model (GMM) techniques are proposed to significantly minimize the error functions. The main idea concerns various strategies to optimize the procedure of Gradient ...
Design and Training of Artificial Neural Networks Using an Evolutionary Strategy with Parallel Populations
Application of artificial neural networks (ANN) in areas such as classification of images and audio signals shows the ability of this artificial intelligence technique to solve practical problems. Construction and training of ANNs is usually a time-consuming and hard process. A suitable neural model must be able to learn the training data and also have generalization ability. In this pap...
Publication date: 2002